ABSTRACT. We consider the problem of estimating the joint distribution function of the event
time and a continuous mark variable when the event time is subject to interval censoring case 
1 and the continuous mark variable is only observed in case the event occurred before the time 
of inspection. The nonparametric maximum likelihood estimator in this model is known to 
be inconsistent. We study two alternative smooth estimators, based on the explicit (inverse) 
expression of the distribution function of interest in terms of the density of the observable 
vector. We derive the pointwise asymptotic distribution of both estimators. 

1 Introduction

To test the efficacy of a vaccine, preventative trials are held in which participants are injected
with the vaccine and tested several times. One of the questions of interest in these trials is
whether the efficacy depends on the genetic sequence of the exposing virus. To answer this
question, Flynn et al. (2005) studied the so-called viral distance between the HIV sequence
represented in the vaccine and the HIV sequence the participant is infected with. This distance
can be considered as a "mark" variable, since it can only be observed if infection has already
taken place. This variable is possibly correlated with the time of HIV infection and, according
to Gilbert et al. (2001), it is natural to treat it as a continuous random variable.

A natural statistical model to describe the observations in these HIV vaccine trials is the
interval censored continuous mark model, which was first studied by Hudgens et al. (2007). In
this model, $X$ is an event time (the time of HIV infection) and $Y$ is a continuous mark variable
(the viral distance) and we are interested in the bivariate distribution function $F_0$ of the pair
$(X, Y)$. However, the event time is subject to interval censoring case $k$. We restrict ourselves
to the special instance of interval censoring case 1 (also known as current status censoring) and
refer to this model as the current status continuous mark model.

For this model, the method of nonparametric maximum likelihood estimation is studied by
Maathuis and Wellner (2008). There it is proved that the maximum likelihood estimator (MLE)
is inconsistent. An approach they propose to 'repair' the inconsistency is to discretize the
mark variable. Discretizing the mark variable to $K$ levels, the resulting observations can be
viewed as observations from the current status $K$-competing risk model. The characterization,
consistency and (local) asymptotic distribution theory of the MLE in that model follow from
Groeneboom et al. (2008a, 2008b). Results on consistency and asymptotics as $K \to \infty$ are not
yet known.

Another natural way to estimate the distribution function $F_0$ is by viewing this problem as
an inverse statistical model. In inverse models, like interval censoring models or deconvolution
models, one is interested in estimating the distribution of a random variable $X$. Instead of
observing this variable $X$ directly, only a related variable $W$ is observed. The distribution of $W$
depends on the distribution function $F_0$ of $X$ (or its Lebesgue density $f_0$) via a known (direct)
relation. In some cases, this relation can be explicitly inverted to express $F_0$ in terms of the
distribution of $W$, and to estimate $F_0$ one can plug in an estimator for the distribution of $W$
in this inverse relation. The resulting estimator is called a plug-in inverse estimator. Plug-in
inverse estimators are studied by Hall and Smith (1988) in Wicksell's corpuscle problem, by
Stefanski and Carroll (1990) in the deconvolution model and by Burke (1988) in the bivariate
right-censoring model.

In this paper we study plug-in inverse estimators in the current status continuous mark
model. We start with a formal description of the model and define two plug-in inverse estimators
in Section 2. One estimator is based on univariate kernel smoothing, the other is based on
bivariate kernel smoothing. In Section 3 we prove that these estimators are uniformly consistent
for $F_0$. Unfortunately, these estimators are not monotonically increasing in both directions,
which is a necessary property of bivariate distribution functions. In Section 3 we also prove that
the estimator based on bivariate kernel smoothing asymptotically will have all properties of
a bivariate distribution function on a large subset of $[0,\infty)^2$. The plug-in inverse estimator
resulting from the univariate kernel smoothing estimator is computationally and asymptotically
more tractable. In Section 4 we first derive the asymptotic distribution of this estimator. After
that, we prove that for certain choices of the smoothing parameter in the $z$-direction, the two
plug-in inverse estimators are asymptotically equivalent, while for other choices the asymptotic
biases differ but the asymptotic variances are equal. This phenomenon was also observed
by Marron and Padgett (1987) and Patil et al. (1994) in the case of estimating densities
based on right-censored data and by Groeneboom et al. (2010) in the current status model.
The asymptotic distribution of the estimator based on bivariate kernel smoothing then follows
easily. In Section 5 we briefly address the problem of estimating smooth functionals. A small
simulation study to compare the estimators with the binned MLE studied by Maathuis and
Wellner (2008) and the maximum smoothed likelihood estimator studied by Groeneboom et al.
(2010) is performed in Section 6. Technical proofs and lemmas can be found in the Appendix.



2 Definition of the estimators 

In this section we describe the current status continuous mark model in more detail and define 
two plug-in inverse estimators based on kernel smoothing. 

Let $X$ be an event time, $Y$ a continuous mark variable and $F_0$ be the distribution function
of the pair $(X, Y)$. In the current status continuous mark model, instead of observing the pair
$(X, Y)$, we observe a censoring variable $T$, independent of $(X, Y)$ with Lebesgue density $g$, as
well as the indicator variable $\Delta = 1_{\{X \le T\}}$. In case $X \le T$, i.e. if $\Delta = 1$, we also observe
the mark variable $Y$; in case $\Delta = 0$ the variable $Y$ is not observed. Under the assumption
that $P(Y = 0) = 0$, we can represent the observable information on $(X, Y)$ in the vector
$W = (T, Z, \Delta)$, for $Z = \Delta \cdot Y$.
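
The observation scheme can be made concrete with a small simulation. The following is a minimal sketch, assuming purely for illustration that $(X, Y)$ is uniform on $[0,1]^2$ and $T$ is uniform on $[0,1]$; the function name `simulate_current_status_marks` is hypothetical and not part of the paper.

```python
import random


def simulate_current_status_marks(n, seed=0):
    """Draw n observations W_i = (T_i, Z_i, Delta_i).

    Illustrative assumption: (X, Y) is uniform on [0, 1]^2 and the
    inspection time T is uniform on [0, 1], independent of (X, Y).
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.random()              # event time X, never observed directly
        y = rng.random()              # mark Y, observed only if X <= T
        t = rng.random()              # inspection time T
        delta = 1 if x <= t else 0    # current status indicator
        z = delta * y                 # Z = Delta * Y, zero if the mark is unobserved
        sample.append((t, z, delta))
    return sample


sample = simulate_current_status_marks(1000, seed=1)
```

Only the triples $(T_i, Z_i, \Delta_i)$ are returned, mirroring the fact that $X$ itself is never observed.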

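Let $\lambda_i$ be Lebesgue measure on $\mathbb{R}^i$, $\mathcal{B}$ the Borel $\sigma$-algebra on $[0,\infty)^2$ and define the measure
$\lambda$ on $\mathcal{B}$ by

$$\lambda(B) = \lambda_2(B) + \lambda_1(\{x \in [0,\infty) : (x,0) \in B\}), \qquad B \in \mathcal{B}.$$

Then, the density of the observable vector $W$ w.r.t. the product of this measure with counting
measure on $\{0, 1\}$ can be written as

$$h_{F_0}(t,z,\delta) = \delta g(t)\partial_2 F_0(t,z) + (1-\delta)g(t)(1 - F_{0,X}(t)) = \delta h_1(t,z) + (1-\delta)h_0(t), \tag{1}$$

where $F_{0,X}$ is the marginal distribution function of $X$ and $\partial_2 F_0(t,z) = \frac{\partial}{\partial z}F_0(t,z)$. More generally, for
convenience of notation, we denote the $j$th partial derivative with respect to $x_i$ of a function $F$
by $\partial_i^j F$, i.e.

$$\partial_i^j F(x_1,x_2) = \frac{\partial^j}{\partial y_i^j}F(y_1,y_2)\Big|_{(y_1,y_2)=(x_1,x_2)},$$

and omit $j$ when $j = 1$.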

Based on the relation $h_1(t,z) = g(t)\partial_2 F_0(t,z)$, we can express the bivariate distribution
function $F_0$ of $(X, Y)$ in terms of the (sub-)densities $g$ and $h_1$:

$$F_0(t,z) = \frac{1}{g(t)}\int_0^z h_1(t,v)\,dv. \tag{2}$$

Then, our plug-in inverse estimator in the current status continuous mark model is defined as

$$\hat F(t,z) = \frac{1}{\hat g(t)}\int_0^z \hat h_1(t,v)\,dv,$$

where $\hat g$ and $\hat h_1$ are estimators for $g$ and $h_1$, respectively.
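
For completeness, the inverse relation (2) can be checked in one line. The sketch below assumes that $g(t) > 0$ and that $F_0(t,0) = 0$, which holds when the mark is nonnegative with $P(Y = 0) = 0$.

```latex
\int_0^z h_1(t,v)\,dv
  = g(t)\int_0^z \partial_2 F_0(t,v)\,dv
  = g(t)\,\bigl\{F_0(t,z) - F_0(t,0)\bigr\}
  = g(t)\,F_0(t,z).
```

Dividing both sides by $g(t)$ gives (2).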

Before explicitly choosing the estimators $\hat g$ and $\hat h_1$, we introduce some notation. Throughout
the paper $k$ denotes a univariate kernel density, $\tilde k$ a bivariate kernel density and $(\alpha_n)$ and $(\beta_n)$
vanishing sequences of positive smoothing parameters. Let $k_{\alpha_n}$ and $\tilde k_{\alpha_n,\beta_n}$ be the rescaled versions
of $k$ and $\tilde k$, i.e., $k_{\alpha_n}(u) = \alpha_n^{-1}k(u/\alpha_n)$ and $\tilde k_{\alpha_n,\beta_n}(u,v) = \alpha_n^{-1}\beta_n^{-1}\tilde k(u/\alpha_n, v/\beta_n)$. Furthermore,
we define

$$m_2(k) = \int u^2 k(u)\,du, \qquad m_2(\tilde k) = \iint w_1^2\,\tilde k(w_1,w_2)\,dw_1\,dw_2.$$

Then for fixed $t_0$ and $z_0$, we estimate $g$ and $h_1$ by their respective univariate and bivariate
kernel (sub-)density estimators

$$g_n(t_0) = \frac1n\sum_{i=1}^n k_{\alpha_n}(t_0 - T_i), \qquad h^{(2)}_{n,1}(t_0,z_0) = \frac1n\sum_{i=1}^n \Delta_i\,\tilde k_{\alpha_n,\beta_n}(t_0 - T_i, z_0 - Z_i).$$

The plug-in inverse estimator then becomes

$$\hat F_n^{(2)}(t_0,z_0) = \frac{\int_0^{z_0}\sum_{i=1}^n \Delta_i\,\tilde k_{\alpha_n,\beta_n}(t_0 - T_i, z - Z_i)\,dz}{\sum_{i=1}^n k_{\alpha_n}(t_0 - T_i)}. \tag{3}$$

Here, superscript 2 in the notation for the plug-in estimator refers to the fact that there is
smoothing in two directions.
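
As an illustration of the estimator with smoothing in two directions, the sketch below assumes a product kernel $\tilde k(x,y) = k(x)k(y)$ with $k$ the Epanechnikov kernel, so that the integral over $z$ in the numerator is available in closed form through the kernel's distribution function. The function names are hypothetical, and the factor $\alpha_n^{-1}$ cancels between numerator and denominator.

```python
def epanechnikov(u):
    """Univariate kernel k(u) = 3/4 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def epanechnikov_cdf(u):
    """Integral of the Epanechnikov kernel from -1 up to u."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return 0.5 + 0.75 * u - 0.25 * u ** 3


def plug_in_two_directions(sample, t0, z0, alpha, beta):
    """Plug-in inverse estimate at (t0, z0), smoothing in t and in z.

    sample: iterable of (T_i, Z_i, Delta_i); alpha, beta: bandwidths.
    Assumes the product Epanechnikov kernel (an illustrative choice).
    """
    num = 0.0
    den = 0.0
    for t, z, delta in sample:
        w = epanechnikov((t0 - t) / alpha)
        den += w
        if delta == 1 and w > 0.0:
            # integral over [0, z0] of the rescaled kernel centred at Z_i
            mass = (epanechnikov_cdf((z0 - z) / beta)
                    - epanechnikov_cdf((0.0 - z) / beta))
            num += w * mass
    if den == 0.0:
        raise ValueError("no observations T_i within alpha of t0")
    return num / den
```

The closed-form integral avoids numerical integration over $z$; with a non-product kernel this shortcut would not apply.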

In Section 4 we also consider a less natural, but computationally and asymptotically more
tractable estimator using an estimate for the numerator $\int_0^{z_0} h_1(t_0, z)\,dz$ based on smoothing
only in the $t$-direction, i.e., when we estimate it by

$$\frac1n\sum_{i=1}^n 1_{[0,z_0]}(Z_i)\,\Delta_i\,k_{\alpha_n}(t_0 - T_i).$$

The plug-in inverse estimator then becomes

$$\hat F_n^{(1)}(t_0,z_0) = \frac{\sum_{i=1}^n 1_{[0,z_0]}(Z_i)\,\Delta_i\,k_{\alpha_n}(t_0 - T_i)}{\sum_{i=1}^n k_{\alpha_n}(t_0 - T_i)}. \tag{4}$$

Superscript 1 in the notation for this estimator refers to the fact that there is only smoothing
in one direction. Note that taking $k(y) = \frac12 1_{[-1,1]}(y)$ results in

$$\hat F_n^{(1)}(t_0,z_0) = \frac{\int_{u\in A_n,\,v\in[0,z_0]} d\mathbb{H}_n(u,v,1)}{\int_{u\in A_n,\,z\ge 0} d\mathbb{H}_n(u,z,\delta)},$$

where $\mathbb{H}_n$ is the empirical distribution of the observations $(T_1, Z_1, \Delta_1), \ldots, (T_n, Z_n, \Delta_n)$ and
$A_n = A_n(t_0) = [t_0 - \alpha_n, t_0 + \alpha_n]$. This estimator is the total number of observations $T_i$ in $A_n$
with $Z$-value smaller than or equal to $z_0$ and $\Delta = 1$ divided by the total number of observations
$(T_i, Z_i)$ in the strip $A_n \times [0, \infty)$.
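
The estimator with smoothing in one direction is a kernel-weighted ratio and is straightforward to compute. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the sample is a list of triples $(T_i, Z_i, \Delta_i)$, and the normalising factors $1/(n\alpha_n)$ are omitted because they cancel in the ratio.

```python
def epanechnikov(u):
    """Kernel k(u) = 3/4 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def uniform_kernel(u):
    """Kernel k(u) = 1/2 on [-1, 1]."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def plug_in_one_direction(sample, t0, z0, alpha, kernel=epanechnikov):
    """Plug-in inverse estimate at (t0, z0), smoothing in t only.

    sample: iterable of (T_i, Z_i, Delta_i); alpha: bandwidth alpha_n.
    """
    num = 0.0
    den = 0.0
    for t, z, delta in sample:
        w = kernel((t0 - t) / alpha)
        den += w
        if delta == 1 and z <= z0:
            num += w
    if den == 0.0:
        raise ValueError("no observations T_i within alpha of t0")
    return num / den
```

With `uniform_kernel` the value is the fraction of observations in the strip around $t_0$ that have $\Delta = 1$ and $Z \le z_0$, in line with the counting interpretation given above.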

It is very natural to define the kernel density $k$ in terms of the kernel density $\tilde k$ as stated in
assumption (K.1):

(K.1) Let $\tilde k$ be a bivariate kernel density, then the kernel density $k$ is defined as

$$k(w_1) = \int \tilde k(w_1,w_2)\,dw_2.$$

Indeed, if (K.1) holds the estimator $\hat F_n^{(2)}$ also satisfies the inverse relation $h_0(t) = g(t)(1 - F_{0,X}(t))$
that follows from substituting $\delta = 0$ in (1). To see this, note that we have that

$$\begin{aligned}
g_n(t_0) = \frac1n\sum_{i=1}^n k_{\alpha_n}(t_0 - T_i)
 &= \frac1n\sum_{i=1}^n (1-\Delta_i)k_{\alpha_n}(t_0 - T_i) + \frac1n\sum_{i=1}^n \Delta_i k_{\alpha_n}(t_0 - T_i) \\
 &= \frac1n\sum_{i=1}^n (1-\Delta_i)k_{\alpha_n}(t_0 - T_i) + \int \frac1n\sum_{i=1}^n \Delta_i\,\tilde k_{\alpha_n,\beta_n}(t_0 - T_i, z - Z_i)\,dz.
\end{aligned}$$

If we define $h_{n,0}(t_0) = \frac1n\sum_{i=1}^n (1-\Delta_i)k_{\alpha_n}(t_0 - T_i)$ as an estimator for the sub-density $h_0$ in
(1), then

$$1 - \hat F^{(2)}_{n,X}(t_0) = 1 - \hat F_n^{(2)}(t_0,\infty) = 1 - \frac{\int h^{(2)}_{n,1}(t_0,z)\,dz}{g_n(t_0)} = 1 - \frac{g_n(t_0) - h_{n,0}(t_0)}{g_n(t_0)} = \frac{h_{n,0}(t_0)}{g_n(t_0)}.$$

Figure 1 illustrates the estimator $\hat F_n^{(1)}$ for $n = 10$ and $n = 100$. For $F_0$ we took the uniform
distribution on $[0,1]^2$ and for $g$ the uniform distribution on $[0,1]$. As kernel density we used
$k(y) =$ [...]. The smoothing parameter $\alpha_n$ is taken to be 0.65 for $n = 10$ and 0.40 for
$n = 100$. These values are chosen for illustrative purposes only and do not depend on the data.
In Section 7 we briefly address the problem of choosing $\alpha_n$ and $\beta_n$ depending on the data.

[Figure 1 here]

Note that these estimators are not true bivariate distribution functions, as they decrease locally
in the $x$-direction. Monotonicity of a bivariate function is a necessary (but not sufficient)
condition in order to be a bivariate distribution function, hence these estimators can be seen
as naive estimators. The estimator $\hat F_n^{(2)}$ can also have this undesirable naive behavior.



3 Consistency and monotonicity 

In this section we prove that the estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ are uniformly consistent. Furthermore,
we prove that for appropriate choices of the bandwidths and $n$ sufficiently large, $\hat F_n^{(2)}$ will
have all properties of a bivariate distribution function on a large subset of $[0,\infty)^2$, with
arbitrarily high probability. To derive these results for $\hat F_n^{(2)}$, we assume the distribution function
of interest $F_0$ and the censoring density $g$ satisfy the following conditions.

(F.1) The Lebesgue density $f_0$ of $F_0$ exists for all $(t, z) \in [0, \infty)^2$.

(G.1) Let $S_{0,X}$ denote the interior of the support of the marginal density $f_{0,X}$ of $X$. On $S_{0,X}$, the
density $g$ satisfies $0 < g < \infty$ and its derivative $g'$ is uniformly continuous and bounded.

We also impose some conditions on the kernel densities $k$ and $\tilde k$, as well as a condition on
the smoothing parameters $\alpha_n$ and $\beta_n$.

(K.2) The kernel density $k$ has compact support $[-1, 1]$, is continuous and symmetric around 0.

(K.3) The kernel density $\tilde k$ has compact support $[-1, 1]^2$, is continuous and satisfies

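$$\iint w_1\,\tilde k(w_1,w_2)\,dw_1\,dw_2 = 0 = \iint w_2\,\tilde k(w_1,w_2)\,dw_1\,dw_2, \qquad \iint w_1^2\,\tilde k(w_1,w_2)\,dw_1\,dw_2 = \iint w_2^2\,\tilde k(w_1,w_2)\,dw_1\,dw_2.$$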



(C.1) The positive smoothing parameters $\alpha_n$ and $\beta_n$ satisfy

$$\lim_{n\to\infty}\alpha_n = \lim_{n\to\infty}\beta_n = 0, \qquad \lim_{n\to\infty} n\alpha_n = \infty.$$

A possible choice for the bivariate kernel density $\tilde k$ is the product kernel density $\tilde k(x,y) =
k_1(x)k_2(y)$ for univariate kernel densities $k_1$ and $k_2$ with compact support $[-1,1]$ that are
continuous and symmetric around 0. This kernel density $\tilde k$ satisfies condition (K.1) for $k = k_1$
and (K.3) if $m_2(k_1) = m_2(k_2)$.

Theorem 1 Assume $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Also assume $k$ is defined
via relation (K.1) and satisfies condition (K.2). Furthermore, let $\alpha_n$ and $\beta_n$ satisfy condition
(C.1). Let $A \subset \mathbb{R}^2$ be a compact set such that $g(t) > c > 0$ for all $(t,z) \in A$. Then $\hat F_n^{(1)}$ and
$\hat F_n^{(2)}$ are uniformly consistent on $A$.


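Proof: The uniform consistency of $\hat F_n^{(1)}$ follows from Theorem 3.2 in Härdle et al. (1988).
To prove that $\hat F_n^{(2)}$ is uniformly consistent on $A$, first note that for $n$ sufficiently large there
exists $\varepsilon > 0$ such that

$$\sup_{(t,z)\in A}\bigl|h^{(2)}_{n,1}(t,z) - h_1(t,z)\bigr| < \varepsilon,$$

see also the corresponding lemma in the Appendix. Hence

$$\Bigl|\int_0^z h^{(2)}_{n,1}(t,y)\,dy - \int_0^z h_1(t,y)\,dy\Bigr| \le \int_0^z \bigl|h^{(2)}_{n,1}(t,y) - h_1(t,y)\bigr|\,dy < \varepsilon z.$$

Since $(t,z) \in A$ and $A$ is compact, this implies that

$$\sup_{(t,z)\in A}\Bigl|\int_0^z h^{(2)}_{n,1}(t,y)\,dy - \int_0^z h_1(t,y)\,dy\Bigr| \xrightarrow{P} 0. \tag{5}$$

Write $N_n^{(2)}(t,z) = \int_0^z h^{(2)}_{n,1}(t,y)\,dy$ and $N(t,z) = \int_0^z h_1(t,y)\,dy$. Then we have that

$$\begin{aligned}
\bigl|\hat F_n^{(2)}(t,z) - F_0(t,z)\bigr|
 &= \Bigl|\frac{N_n^{(2)}(t,z)}{g_n(t)} - \frac{N(t,z)}{g(t)}\Bigr| \\
 &\le \Bigl|\frac{N_n^{(2)}(t,z)}{g(t)} - \frac{N(t,z)}{g(t)}\Bigr| + \Bigl|\frac{N_n^{(2)}(t,z)}{g_n(t)} - \frac{N_n^{(2)}(t,z)}{g(t)}\Bigr| \\
 &= \frac{1}{g(t)}\bigl|N_n^{(2)}(t,z) - N(t,z)\bigr| + N_n^{(2)}(t,z)\Bigl|\frac{1}{g_n(t)} - \frac{1}{g(t)}\Bigr|.
\end{aligned}$$

The first term converges uniformly to zero in probability over $A$ by (5). The second term
converges uniformly to zero in probability by the uniform consistency of $g_n$ (see the Appendix),
and uniform consistency of $\hat F_n^{(2)}$ follows.

□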



Each bivariate distribution function $F$ has to satisfy

$$\forall\, x_1 \le x_2,\ y_1 \le y_2 : \quad F(x_2,y_2) - F(x_2,y_1) - F(x_1,y_2) + F(x_1,y_1) \ge 0. \tag{6}$$

This condition requires that each rectangle $[x_1, x_2] \times [y_1, y_2]$ has a nonnegative mass and suggests
that some shape constraints on $F_0$ are imposed by the model. However, in Theorem 2 below,
we prove that it is not necessary to use this shape constraint to estimate $F_0$ since the estimator
$\hat F_n^{(2)}$ satisfies condition (6) asymptotically. To prove this, we prove that the Lebesgue density
$f_n^{(2)}$ is positive, with probability converging to one. The estimator $\hat F_n^{(1)}$ does not have a density
w.r.t. Lebesgue measure $\lambda_2$, hence a similar result cannot be proved in this way for $\hat F_n^{(1)}$. To
prove Theorem 2 we need stronger conditions on $\alpha_n$ and $\beta_n$ than condition (C.1).
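
Condition (6) can be checked numerically for any candidate function on a grid. The sketch below is an illustration only: the function name `rectangle_violations` and the tolerance are hypothetical choices. It inspects adjacent grid cells, which suffices for all rectangles with corners on the grid because the mass of such a rectangle is the sum of the masses of the cells it contains.

```python
def rectangle_violations(F, xs, ys, tol=1e-12):
    """Return the grid cells on which F assigns negative mass.

    F: function of two arguments; xs, ys: increasing grids.
    Each entry is (x1, x2, y1, y2, mass) with mass < -tol.
    """
    bad = []
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            mass = (F(xs[i + 1], ys[j + 1]) - F(xs[i + 1], ys[j])
                    - F(xs[i], ys[j + 1]) + F(xs[i], ys[j]))
            if mass < -tol:
                bad.append((xs[i], xs[i + 1], ys[j], ys[j + 1], mass))
    return bad


grid = [i / 10 for i in range(11)]
# A genuine distribution function on [0, 1]^2 passes the check.
ok = rectangle_violations(lambda x, y: 0.5 * x * y * (x + y), grid, grid)
# A function that decreases in x for x > 0.75 does not.
bad = rectangle_violations(lambda x, y: x * y * (1.5 - x), grid, grid)
```

A check of this kind only detects violations at the chosen grid resolution; it does not prove that (6) holds everywhere.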

(C.2) The smoothing parameters $\alpha_n$ and $\beta_n$ converge to zero as $n \to \infty$ and satisfy

lim = 00, lim na^/3„ = 00. 

Note that sequences $\alpha_n$ and $\beta_n$ satisfying condition (C.2) also satisfy condition (C.1) and
na'^ — >■ 00. 

Theorem 2 Assume $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Also assume $k$ and $\tilde k$
satisfy conditions (K.2) and (K.3). In addition, assume $k'$ and $\partial_1\tilde k$ are uniformly continuous.
Furthermore, let $\alpha_n$ and $\beta_n$ satisfy condition (C.2). Let $S \subset [0,\infty)^2$ be compact and
such that $f_0$ is uniformly continuous on an open subset that contains $S$ and define, for all $\delta > 0$,
$\mathcal{M}_\delta = \{(t,z) \in [0,\infty)^2 : f_0(t,z) \ge 2\delta\} \cap S$. Then for $\delta > 0$,



P\V{t,z)eMs : f^?\t,z)>^5 



(7) 
where $f_n^{(2)}$ is the Lebesgue density of $\hat F_n^{(2)}$ and $l_g$ and $u_g$ are as defined in a lemma in the Appendix.



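Proof: Fix $\delta > 0$. First note that since

$$\frac{\partial^2}{\partial t\,\partial z}\int_0^z h^{(2)}_{n,1}(t,v)\,dv = \partial_1 h^{(2)}_{n,1}(t,z),$$

we have the following expression for $f_n^{(2)}$:

$$f_n^{(2)}(t,z) = \frac{\partial^2}{\partial u\,\partial v}\hat F_n^{(2)}(u,v)\Big|_{(u,v)=(t,z)} = \frac{g_n(t)\,\partial_1 h^{(2)}_{n,1}(t,z) - g_n'(t)\,h^{(2)}_{n,1}(t,z)}{g_n(t)^2}. \tag{8}$$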

We first consider the numerator and prove that 

P(V (t, z)eMs : .9„(t)9i/ii'l(t, z) - g'Mh^^\{t. ^) > 2Z^<^) ^ 1. (9) 

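For this, note that for all $(t,z) \in \mathcal{M}_\delta$

$$\begin{aligned}
g_n(t)\,\partial_1 h^{(2)}_{n,1}(t,z) - g_n'(t)\,h^{(2)}_{n,1}(t,z)
 ={}& g_n(t)\bigl(\partial_1 h^{(2)}_{n,1}(t,z) - \partial_1 h_1(t,z)\bigr) + h^{(2)}_{n,1}(t,z)\bigl(g'(t) - g_n'(t)\bigr) \\
 &+ \partial_1 h_1(t,z)\bigl(g_n(t) - g(t)\bigr) + g'(t)\bigl(h_1(t,z) - h^{(2)}_{n,1}(t,z)\bigr) \\
 &+ g(t)\,\partial_1 h_1(t,z) - g'(t)\,h_1(t,z) \\
 \ge{}& -\sup_{t\in \mathrm{proj}_X\mathcal{M}_\delta} g_n(t)\ \sup_{(t,z)\in\mathcal{M}_\delta}\bigl|\partial_1 h^{(2)}_{n,1}(t,z) - \partial_1 h_1(t,z)\bigr| \\
 &- \sup_{(t,z)\in\mathcal{M}_\delta} h^{(2)}_{n,1}(t,z)\ \sup_{t\in \mathrm{proj}_X\mathcal{M}_\delta}\bigl|g'(t) - g_n'(t)\bigr| \\
 &- \sup_{(t,z)\in\mathcal{M}_\delta}\bigl|\partial_1 h_1(t,z)\bigr|\ \sup_{t\in \mathrm{proj}_X\mathcal{M}_\delta}\bigl|g_n(t) - g(t)\bigr| \\
 &- \sup_{t\in \mathrm{proj}_X\mathcal{M}_\delta}\bigl|g'(t)\bigr|\ \sup_{(t,z)\in\mathcal{M}_\delta}\bigl|h_1(t,z) - h^{(2)}_{n,1}(t,z)\bigr| \\
 &+ g(t)\,\partial_1 h_1(t,z) - g'(t)\,h_1(t,z),
\end{aligned}$$

with $\mathrm{proj}_X\mathcal{M}_\delta = \{t : (t,z) \in \mathcal{M}_\delta \text{ for some } z\}$. By the uniform consistency results in the
Appendix, all random terms converge to zero in probability. Since
$g(t)\,\partial_1 h_1(t,z) - g'(t)\,h_1(t,z) = g(t)^2 f_0(t,z)$, we have that the last term is
bounded below by $\inf g(t)^2 f_0(t,z) \ge 2 l_g^2\delta$, with $l_g$ as defined in the Appendix.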

By Lemma [t] and the uniform consistency of gn [see Lemma [s], we have that < ^Ig < 
gn{t) < 'i.Ug < oo for all t G projxMs with probability converging to one. This implies that 
for ah {t, z) e Ms 

with probability converging to one. Hence ([7| follows. □ 

Remark. If, in addition to condition (F.1), we assume that $f_0$ is uniformly continuous on
$[0,\infty)^2$, this theorem implies that for each $\delta > 0$ and $M > 0$, the restriction of $\hat F_n^{(2)}$ to the set
$\{(t,z) \in [0,M]^2 : f_0(t,z) \ge \delta\}$ will asymptotically be the restriction to this set of a bivariate
distribution function on $[0,\infty)^2$.

4 Asymptotic distributions

In this section we derive the asymptotic distribution of both plug-in inverse estimators. Although
the estimator $\hat F_n^{(2)}$ is more natural, we start with the estimator $\hat F_n^{(1)}$ since deriving
its asymptotic distribution is easier. Subsequently, we prove that for certain choices of the
smoothing parameter $\beta_n$ the estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ are asymptotically equivalent, yielding
the asymptotic distribution of $\hat F_n^{(2)}$.

Theorem 3 Assume $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Also assume $k$ satisfies
condition (K.2). Fix $t_0, z_0 > 0$ such that $\partial_1^2 F_0(t,z)$ and $g''(t)$ exist and are continuous in a
neighborhood of $(t_0, z_0)$ and $t_0$, respectively, and $\partial_1^2 F_0(t_0,z_0) + 2g'(t_0)\,\partial_1 F_0(t_0,z_0)/g(t_0) \ne 0$ and
$g(t_0) > 0$. Then for $\alpha_n = cn^{-1/5}$,

$$n^{2/5}\bigl(\hat F_n^{(1)}(t_0,z_0) - F_0(t_0,z_0)\bigr) \rightsquigarrow \mathcal{N}(\mu_1,\sigma^2),$$

where

$$\mu_1 = \frac12 c^2 m_2(k)\Bigl\{\partial_1^2 F_0(t_0,z_0) + 2\,\frac{g'(t_0)\,\partial_1 F_0(t_0,z_0)}{g(t_0)}\Bigr\}, \tag{10}$$

$$\sigma^2 = c^{-1}\,\frac{F_0(t_0,z_0)\bigl(1 - F_0(t_0,z_0)\bigr)}{g(t_0)}\int k(u)^2\,du. \tag{11}$$

Remark. In case $\partial_1^2 F_0(t_0,z_0) + 2g'(t_0)\,\partial_1 F_0(t_0,z_0)/g(t_0) = 0$, the rate of convergence changes
because the bias is of a different asymptotic order. This is in line with results for other kernel
smoothers in case of vanishing first order bias terms.

The proof of this theorem, a combination of the Lindeberg-Feller Central Limit Theorem 
and the Delta-method, is given in the Appendix. 

To illustrate the pointwise asymptotic results we simulate $m = 1\,000$ times a sample of size
$n = 5\,000$, using $F_0(x,y) = \frac12 xy(x + y)$ for $x, y \in [0,1]$ and $g(t) = 2t$ for $t \in [0,1]$. For each
sample we determine the estimator $\hat F_n^{(1)}(0.5, 0.5)$ (using kernel density $k(y) = \frac34(1 - y^2)1_{[-1,1]}(y)$ and
smoothing parameter $\alpha_n = 0.09$) and the resulting value of $n^{2/5}\bigl(\hat F_n^{(1)}(0.5,0.5) - F_0(0.5,0.5)\bigr)$.
Figure 2 shows these $m$ values, in a QQ-plot (with the line $y = \mu_1 + x\sigma$) as well as in a histogram
(with the $\mathcal{N}(\mu_1,\sigma^2)$ density). Here $\mu_1$ and $\sigma$ are as defined in (10) and (11) for this $F_0$ and $g$.
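
The limit parameters in (10) and (11) can be evaluated directly for this simulation setting. The sketch below is a worked example under stated assumptions: it takes $c = \alpha_n n^{1/5}$ from the relation $\alpha_n = cn^{-1/5}$, uses the standard Epanechnikov constants $m_2(k) = 1/5$ and $\int k(u)^2\,du = 3/5$, and the function name is hypothetical.

```python
def pointwise_limit_parameters(c, F0, d1F0, d11F0, g, dg, m2_k, int_k2):
    """Asymptotic bias mu_1 and variance sigma^2 from (10) and (11).

    All arguments are values at the fixed point (t0, z0):
    F0, its first and second partial derivatives in t, g and g'.
    """
    mu1 = 0.5 * c ** 2 * m2_k * (d11F0 + 2.0 * dg * d1F0 / g)
    sigma2 = F0 * (1.0 - F0) * int_k2 / (c * g)
    return mu1, sigma2


n, alpha_n = 5000, 0.09
c = alpha_n * n ** 0.2                 # from alpha_n = c * n^(-1/5)
t0 = z0 = 0.5
F0 = 0.5 * t0 * z0 * (t0 + z0)         # F_0(x, y) = xy(x + y)/2
d1F0 = t0 * z0 + 0.5 * z0 ** 2         # first partial derivative in t
d11F0 = z0                             # second partial derivative in t
g, dg = 2.0 * t0, 2.0                  # g(t) = 2t
m2_k, int_k2 = 0.2, 0.6                # Epanechnikov kernel constants

mu1, sigma2 = pointwise_limit_parameters(c, F0, d1F0, d11F0, g, dg, m2_k, int_k2)
```

In this setting the expressions simplify to $\mu_1 = 0.2\,c^2$ and $\sigma^2 = 0.065625/c$.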

[Figure 2 here]

Under definition (K.1) and assumptions (K.2) and (K.3) on the kernel densities $k$ and $\tilde k$, we
can prove that for $t_0, z_0 > 0$ fixed $n^{2/5}\bigl(\hat F_n^{(2)}(t_0,z_0) - \hat F_n^{(1)}(t_0,z_0)\bigr)$ converges to zero in probability
whenever $\beta_n$ converges faster to zero than $n^{-1/5}$. As a consequence, these estimators are (first
order) asymptotically equivalent. For $\beta_n$ tending to zero slower than $n^{-1/5}$,
$n^{2/5}\bigl|\hat F_n^{(2)}(t_0,z_0) - \hat F_n^{(1)}(t_0,z_0)\bigr| \to \infty$ in probability. These results are more precisely stated in Theorem 4 and
Corollary 5.

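Theorem 4 Assume $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Also assume $k$ and $\tilde k$ satisfy
conditions (K.2) and (K.3). Fix $t_0, z_0 > 0$ such that $\partial_2^2 F_0(t,z)$ and $g(t)$ exist and are continuous
in a neighborhood of $(t_0,z_0)$ and $t_0$, respectively, and $\partial_2^2 F_0(t_0,z_0) \ne 0$ and $g(t_0) \ne 0$. Let
$\alpha_n = c_1 n^{-1/5}$ and $\beta_n = c_2 n^{-\beta}$, then for $\beta > 1/5$

$$n^{2/5}\bigl(\hat F_n^{(2)}(t_0,z_0) - \hat F_n^{(1)}(t_0,z_0)\bigr) \xrightarrow{P} 0,$$

for $\beta = 1/5$

$$n^{2/5}\bigl(\hat F_n^{(2)}(t_0,z_0) - \hat F_n^{(1)}(t_0,z_0)\bigr) \xrightarrow{P} \frac12 c_2^2\,m_2(\tilde k)\,\partial_2^2 F_0(t_0,z_0),$$

while for $\beta < 1/5$, $n^{2/5}\bigl|\hat F_n^{(2)}(t_0,z_0) - \hat F_n^{(1)}(t_0,z_0)\bigr| \xrightarrow{P} \infty$.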

The proof of this theorem is given in the Appendix. 

As a consequence of this theorem, the estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ are pointwise asymptotically
equivalent for $\beta > 1/5$, while for $\beta = 1/5$, $\hat F_n^{(2)}(t_0,z_0)$ has an additional (possibly negative)
asymptotic bias term.

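Corollary 5 In addition to the conditions of Theorem 3, assume $\partial_2^2 F_0(t_0,z_0) \ne 0$ and $g(t_0) \ne 0$.
Let $\alpha_n = c_1 n^{-1/5}$ and $\beta_n = c_2 n^{-\beta}$. Then for $\beta > 1/5$

$$n^{2/5}\bigl(\hat F_n^{(2)}(t_0,z_0) - F_0(t_0,z_0)\bigr) \rightsquigarrow \mathcal{N}(\mu_1,\sigma^2),$$

where $\mu_1$ and $\sigma$ are defined in (10) and (11) (with $c = c_1$). For $\beta = 1/5$

$$n^{2/5}\bigl(\hat F_n^{(2)}(t_0,z_0) - F_0(t_0,z_0)\bigr) \rightsquigarrow \mathcal{N}(\mu_2,\sigma^2),$$

where

$$\mu_2 = \mu_1 + \frac12 c_2^2\,m_2(\tilde k)\,\partial_2^2 F_0(t_0,z_0). \tag{12}$$

Proof: This immediately follows from Theorem 4. □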

Figure 3 shows the values of $n^{2/5}\bigl(\hat F_n^{(2)}(0.5,0.5) - \hat F_n^{(1)}(0.5,0.5)\bigr)$ as a function of $n$ with
o^n = /?„ = irt^^/"^. The solid lines are the lines ±in^^/^, the order of the standard 

deviation of $n^{2/5}\bigl(\hat F_n^{(2)}(0.5,0.5) - \hat F_n^{(1)}(0.5,0.5)\bigr)$ (see the proof of Theorem 4 in the Appendix).
For $F_0$ and $g$ we used the same setting as in Figure 2; for $\tilde k$ we used $\tilde k(x,y) = k(x)k(y)$, the
product kernel density with $k(u) = \frac34(1 - u^2)$ for $u \in [-1,1]$.

[Figure 3 here]

Figure 4 shows $m = 1\,000$ values of $n^{2/5}\bigl(\hat F_n^{(2)}(0.5,0.5) - F_0(0.5,0.5)\bigr)$ for $n = 5\,000$, $\alpha_n$ =
in^^/^ and /3„ = ^n^^^^, in a QQ-plot (with the line y = fii + xa) as well as in a histogram 



(with the $\mathcal{N}(\mu_1,\sigma^2)$ density). Here $\mu_1$ and $\sigma$ are as defined in (10) and (11), for $F_0$, $g$ and $\tilde k$ the
same as in Figure 3.

[Figure 4 here]

5 Smooth functionals 

It is well known that in the current status model certain functionals of the model can be
estimated at $\sqrt n$ rate, although the pointwise estimation rate is lower, see, e.g., Groeneboom
(1996). In the continuous marks model we have a similar situation and we briefly sketch how
the theory of smooth functionals applies here. In the "hidden space" one would be allowed
to observe the random variable $(X, Y)$ with distribution function $F$, and the so-called score
operator from functions on the hidden space to functions on the observation space is in this
case given by

$$[L_F(a)](t,z,\delta) = E\{a(X,Y) \mid (T,Z,\Delta) = (t,z,\delta)\}
 = \frac{\delta\int_{x=0}^t a(x,z)\,dF_z(x)}{F_z(t)} + \frac{(1-\delta)\int_{y=0}^\infty\int_{x=t}^\infty a(x,y)\,dF_y(x)\,dy}{1 - F(t,\infty)},$$

where $F_z(x) = \partial_2 F(x,z) = \frac{\partial}{\partial z}F(x,z)$. Note that the $F_z$ correspond to the component
subdistribution functions in the model with finitely many competing risks and that $F(t,\infty) =
\int_0^\infty F_z(t)\,dz$. Here $L_F$ is a mapping from $L_2^0(F)$ to $L_2^0(H)$, where $L_2^0(F)$ denotes the space of
square integrable functions $a$ with zero expectation, i.e.

$$E_F\,a(X,Y) = \int a(x,y)\,dF(x,y) = 0, \qquad E_F\,a(X,Y)^2 = \int a(x,y)^2\,dF(x,y) < \infty. \tag{13}$$

Similarly, $L_2^0(H)$ is the space of functions $b$ with the properties:

$$E_H\,b(T,Z,\Delta) = \int b(t,z,\delta)\,dH(t,z,\delta) = 0, \qquad E_H\,b(T,Z,\Delta)^2 = \int b(t,z,\delta)^2\,dH(t,z,\delta) < \infty.$$

Using the first relation in (13) we get:

$$\begin{aligned}
[L_F(a)](t,z,\delta)
 &= \frac{\delta\int_{x=0}^t a(x,z)\,dF_z(x)}{F_z(t)} + \frac{(1-\delta)\int_{y=0}^\infty\int_{x=t}^\infty a(x,y)\,dF_y(x)\,dy}{1 - F(t,\infty)} \\
 &= \frac{\delta\int_{x=0}^t a(x,z)\,dF_z(x)}{F_z(t)} - \frac{(1-\delta)\int_{y=0}^\infty\int_{x=0}^t a(x,y)\,dF_y(x)\,dy}{1 - F(t,\infty)}.
\end{aligned}$$

We now consider the adjoint of $L_F$, mapping the functions $b \in L_2^0(H)$ back into $L_2^0(F)$. The
adjoint is given by:

$$[L_F^*(b)](x,y) = \int_{t=x}^\infty b(t,y,1)\,dG(t) + \int_{t=0}^x b(t,0,0)\,dG(t).$$

This is analogous to what we get in the current status model, see e.g., Groeneboom (1996).

In order to make this somewhat more concrete, we consider the functional

$$\mu_F = \int x\,dF_{0,X}(x) = \int x\,dF(x,\infty). \tag{14}$$

Then the score function in the hidden space is:

$$a(x,y) = x - \int x\,dF(x,\infty) = x - \iint u\,dF_w(u)\,dw,$$

so only depends on the first argument, and we have to solve the equation

$$\int_{t=x}^\infty b(t,z,1)\,dG(t) + \int_{t=0}^x b(t,0,0)\,dG(t) = x - \iint u\,dF_w(u)\,dw,$$

where $b$ has to be in the (closure of the) range of the score operator, so this would be

$$b(t,z,\delta) = \frac{\delta\int_{x=0}^t a(x,z)\,dF_z(x)}{F_z(t)} - \frac{(1-\delta)\int_{y=0}^\infty\int_{x=0}^t a(x,y)\,dF_y(x)\,dy}{1 - F(t,\infty)}$$

if $b$ is in the range itself (and not only its closure). We therefore consider the equation:

$$\int_{t=x}^\infty \frac{\int_{u=0}^t a(u,z)\,dF_z(u)}{F_z(t)}\,dG(t) - \int_{t=0}^x \frac{\int_{y=0}^\infty\int_{u=0}^t a(u,y)\,dF_y(u)\,dy}{1 - F(t,\infty)}\,dG(t) = x - \iint u\,dF_w(u)\,dw.$$

Differentiation w.r.t. $x$ yields:

$$\frac{\int_{u=0}^x a(u,z)\,dF_z(u)}{F_z(x)} + \frac{\int_{y=0}^\infty\int_{u=0}^x a(u,y)\,dF_y(u)\,dy}{1 - F(x,\infty)} = -\frac{1}{g(x)}.$$

Letting $\phi(x,z) = \int_{u=0}^x a(u,z)\,dF_z(u)$, this is solved by taking

$$\phi(x,z) = -\frac{F_z(x)\bigl(1 - F(x,\infty)\bigr)}{g(x)}.$$

So we get

$$\begin{aligned}
b(t,z,\delta) &= -\frac{\delta F_z(t)\bigl(1 - F(t,\infty)\bigr)}{F_z(t)\,g(t)} + \frac{(1-\delta)\bigl(1 - F(t,\infty)\bigr)\int_{y=0}^\infty F_y(t)\,dy}{\bigl(1 - F(t,\infty)\bigr)\,g(t)} \\
 &= -\frac{\delta\bigl(1 - F(t,\infty)\bigr)}{g(t)} + \frac{(1-\delta)F(t,\infty)}{g(t)},
\end{aligned}$$

implying that the efficient asymptotic variance for estimating the mean functional $\mu_F$, defined
by (14), is given by:

$$\int b(t,z,\delta)^2\,dH(t,z,\delta) = \int \frac{F(t,\infty)\bigl(1 - F(t,\infty)\bigr)}{g(t)}\,dt, \tag{15}$$

which (not surprisingly) is the same expression as one gets in the current status model.

The next question becomes whether taking $\int x\,dF_n(x,\infty)$, where $F_n$ is one of our proposed
estimators, will lead to an efficient estimate of $\mu_F$, in the sense that it converges at rate $\sqrt n$,
with an asymptotic variance which attains the information lower bound (15).
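
The information lower bound (15) is easy to evaluate numerically in a concrete case. The sketch below assumes, for illustration, the setting used elsewhere in the paper with support $[0,1]^2$, so that $F(t,\infty) = F_0(t,1) = \frac12 t(t+1)$ and $g(t) = 2t$; the function name and the midpoint rule are hypothetical choices. The midpoint rule avoids evaluating the integrand at $t = 0$, where $g$ vanishes.

```python
def efficient_variance_bound(F_marginal, g, n_cells=10000):
    """Midpoint-rule approximation of the integral in (15) over [0, 1].

    F_marginal: t -> F(t, infinity); g: density of the inspection time.
    """
    h = 1.0 / n_cells
    total = 0.0
    for i in range(n_cells):
        t = (i + 0.5) * h
        Ft = F_marginal(t)
        total += Ft * (1.0 - Ft) / g(t) * h
    return total


bound = efficient_variance_bound(lambda t: 0.5 * t * (t + 1.0),
                                 lambda t: 2.0 * t)
```

In this example the integrand reduces to the polynomial $\frac14(1 + t/2 - t^2 - t^3/2)$, whose integral over $[0,1]$ equals $19/96$, which gives a direct check on the numerical value.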

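Let us consider the estimator $\hat F_n^{(1)}$, defined by (4), and more specifically, the estimator obtained
by taking $k(y) = \frac12 1_{[-1,1]}(y)$. Then (4) becomes

$$\hat F_n^{(1)}(t,z) = \frac{\int_{u\in[t-\alpha_n,t+\alpha_n],\,v\in[0,z]} d\mathbb{H}_n(u,v,1)}{\int_{u\in[t-\alpha_n,t+\alpha_n],\,y\ge 0} d\mathbb{H}_n(u,y,\delta)},$$

where $\mathbb{H}_n$ is the empirical distribution of the sample $W_1, \ldots, W_n$. Also assume that $f_0$ has
compact support, say $[0,1]^2$, as in the setting of Figure 2. Then we get as the estimate of $F_{0,X}(x)$
the value $\hat F_n^{(1)}(x,1)$.

To see whether this estimator leads to an efficient estimate of $\mu_F$, we have to perform a
bias-variance analysis. We first consider the bias. Let $F_{\alpha_n}$ be defined by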

/«e[x-a„,.+a„], y>0 dHpoiu, ?/, ' 

where $H_{F_0}$ is the distribution function of $(T, Z, \Delta)$ in the observation space. Then
■dF^A^)^ {l^F^Jx))dx^ /° ', dx 



" Jo Iuelx-a„.x+a„],y>adHpo{u,y,S) 



u=-a„ Jxe[0,u+a„] !^^2" g{v) dv J u=a„ J xe[u-a„,u+a„] !^_2" di^) dv 



'• g(z.)(l-Fo(u,l)) ^ 

+ _ LLtAj LL til 



/n=l-a„ Jxe[u-a„,l] g{v) dv 

We have, if $g$ is twice continuously differentiable and stays away from zero on $[0,1]$,



dx ^ / rs 1 7"^ — dx 



a;G[«-Q„,«+a,.l G{x + a„) - G{x - «„) 7x6 [«-q„ ,«+a„] 2a„g(a;) + y"{x)al 



and hence 



xelu-a„.u+a„] 2a„g(x)(l + 0{al)) g{u) 



r f g{u){l-Fo{u,l)) ^ p ^^^^ _unr 2^ 

/ / rx+a„ , dxdu^ / (1-Fo(u,l))dw + O(a„) 

Ju=-Q„ Ja;e[0,«+a„l qiv) dv J u=0 



E[o,«+a„] J^^2^g{v)dv Ju=0 

We also have 

g(u)(l-Fo(u,l)) 
x+a„ dxdu= I {l-Foiu,l))du + 0{al)



and similarly 

' 9{u){l~Fo{u,l)) 



dxdu^ / (1- FQ{u,l))du + 0{al) . 



So we obtain 

$$\int_0^1 \bigl(1 - F_{\alpha_n}(x)\bigr)\,dx = \int_0^1 \bigl(1 - F_0(x,1)\bigr)\,dx + O(\alpha_n^2). \tag{16}$$

Empirical process methods give us

$$\int_0^1 \bigl(\hat F_n^{(1)}(x,1) - F_{\alpha_n}(x)\bigr)\,dx = O_p\bigl(n^{-1/2}\bigr). \tag{17}$$

So (16 1 and (17) give us that, if (for example) q;„ is of order n^^^^, 

$$\int_0^1 \bigl(\hat F_n^{(1)}(x,1) - F_0(x,1)\bigr)\,dx = O_p\bigl(n^{-1/2}\bigr). \tag{18}$$

Note that this does not follow if $\alpha_n$ is of order $n^{-1/5}$, since the bias term is too large in that
case!

For the asymptotic variance, one has to analyze: 

f', ^ ,dH„(u,0,0) f^r _^ .dHFJu,0,0) ] 



dx 



x=0 \ lue[x-a„,x+a„],ye[0,l] '^H„(w, y, S) Ju^[^_a^^^^a„],ye[0,l] dHpoiUjU, S) 

which can be written as 

' Lg[.-.„,.+.„]^(H«~gFo)(^^,0,0 ) 

x=0 Iue[x-a„,x+a„], yelO,l] d^niu, y, S) 



' Le[x-c.,,,x+a„].yelOA]diT^n - Hp,) /„g rfgfp (», 0, 0) 

^=0 lue[x^a^,x+a„],ye[0,l] /«£ , [0,1] '^^ Foi"^^ V ^ ^) 

' Foi^^ 1) d (H„ - g^J {u, 0, 0) 

=0 '2g{x)an 

' (1 - J-0(a^, 1)) /.,[.-.„,.+.„l,,g(0,ll (Hn - gFo) (^, 1) 

a;=o 25(x)a„ 
- [ ^^^d(H„-ff^J(^,0,0)- / l^M^d(H„-i/^^J(«,y,l). 

So the asymptotic variance is given by: 



V 2 



di/i^o (m, 0, 0) + / ^ — ^ — '— dHpo (u, y, 1) 

1 Fo(u, 1)2(1 - Fo(w, 1)) ^ , /•! (l-Fo(w,l))Vo(u,l) 



9{u) Jo g{u 

1 Fo(u, 1)(1 - Fo(ii, 1)) ^ /•! i^o(u,oo)(l-Fo(u,(^)) 



Sl'^) Jo 9{u) 

The conclusion is that in this example, our estimator of $\mu_F$ converges at rate $\sqrt n$ and that its
asymptotic variance attains the information lower bound, provided the bandwidth $\alpha_n$ tends to
zero faster than $n^{-1/4}$. It also illustrates that a bandwidth of order $n^{-1/5}$, which is an obvious
choice for the pointwise estimation, is not suitable if we want to estimate smooth functionals, a
phenomenon that seems (more or less) well known. Similar analyses can be performed for other
smooth functionals, but since the local estimation problem is the main focus of our paper, we
will not pursue this further here.
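
The role of the bandwidth rate can be seen from simple arithmetic. With a bias of order $\alpha_n^2$, as in (16), the bias scaled by $\sqrt n$ is of order $\sqrt n\,\alpha_n^2$. The sketch below, with a hypothetical function name and with all constants set to one, evaluates this quantity for $\alpha_n = n^{-r}$.

```python
def scaled_bias_order(n, rate):
    """Return sqrt(n) * alpha_n^2 for alpha_n = n^(-rate), constants ignored."""
    alpha_n = n ** (-rate)
    return n ** 0.5 * alpha_n ** 2


for n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    # rate 1/5: grows like n^(1/10); rate 1/4: constant; rate 1/3: decays like n^(-1/6)
    print(n, scaled_bias_order(n, 1 / 5), scaled_bias_order(n, 1 / 4),
          scaled_bias_order(n, 1 / 3))
```

The scaled bias grows for the rate $1/5$, stays constant at the boundary rate $1/4$ and vanishes for the rate $1/3$, which matches the requirement stated above.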



6 Simulation study 

The estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ are asymptotically equivalent for sufficiently small choices of the
smoothing parameter $\beta_n$. To get some insight into the finite sample differences between the
estimators, we ran a simulation study. We simulated data according to $F_0(x,y) = \frac12 xy(x + y)$ for
$x, y \in [0,1]$ and $g(t) = 2t$ for $t \in [0,1]$ for different sample sizes $n = 500$, $n = 1\,000$, $n = 5\,000$
and $n = 10\,000$. For each simulation we computed the estimators $\hat F_n^{(1)}(t_0,z_0)$ and $\hat F_n^{(2)}(t_0,z_0)$ for
two different values of $(t_0,z_0)$ and different values of the smoothing parameters $\alpha_n$ and $\beta_n$. We
repeated this $B = 250$ times, resulting in 250 estimates $\hat F^{(i)}_{n,\alpha_n,\beta_n,1}(t_0,z_0), \hat F^{(i)}_{n,\alpha_n,\beta_n,2}(t_0,z_0), \ldots, \hat F^{(i)}_{n,\alpha_n,\beta_n,B}(t_0,z_0)$
($i = 1, 2$) for each value of the smoothing parameters $\alpha_n$ and $\beta_n$. Then, we estimated the Mean
Squared Error (MSE) of the estimator $\hat F_n^{(i)}(t_0,z_0)$ by

$$B^{-1}\sum_{j=1}^B \bigl(\hat F^{(i)}_{n,\alpha_n,\beta_n,j}(t_0,z_0) - F_0(t_0,z_0)\bigr)^2.$$
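
The Monte Carlo estimate of the MSE, together with the standard error of the mean of the squared differences, can be computed as in the following sketch. The function name is hypothetical, and the use of the sample variance with divisor $B - 1$ is an assumption, since the paper does not state which variance estimate was used.

```python
import math


def estimated_mse(estimates, truth):
    """Return (MSE estimate, standard error of the mean squared difference).

    estimates: the B replicated values of the estimator at (t0, z0);
    truth: the true value F_0(t0, z0).
    """
    squared = [(e - truth) ** 2 for e in estimates]
    B = len(squared)
    mse = sum(squared) / B
    # sample variance of the squared differences (divisor B - 1, an assumption)
    var = sum((s - mse) ** 2 for s in squared) / (B - 1)
    return mse, math.sqrt(var / B)
```

For instance, `estimated_mse([0.2, 0.4], 0.2)` averages the squared differences $0$ and $0.04$.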

Table 1 shows the minimum value of the estimated MSE for each estimator, for each $n$
and in two different points $(t_0,z_0)$. It also shows the values of the smoothing parameters $\alpha_n$
and $\beta_n$ that yielded this value. The standard error of the mean of the squared differences
$\bigl(\hat F^{(i)}_{n,\alpha_n,\beta_n}(t_0,z_0) - F_0(t_0,z_0)\bigr)^2$ is given in brackets. The binned MLE $F_n$ studied by Maathuis
and Wellner (2008) and the Maximum Smoothed Likelihood Estimator (MSLE) $F_n^{MS}$ studied
by Groeneboom et al. (2010) are included in this simulation study.

[Table 1 here]

Figure 5 shows the resulting values of estimated MSEs as function of $\alpha_n$. For $F_n$, the
smoothing parameter $\alpha_n$ is the binwidth in the $z$-direction, for $F_n^{MS}$ we have that $\alpha_n$ and $\beta_n$
are the bandwidths in $t$- and $z$-direction, respectively. Both $\hat F_n^{(2)}$ and $F_n^{MS}$ depend on two
smoothing parameters, and we fixed the value of $\beta_n$ to be equal to the value that yielded
the overall minimal estimated MSEs of the estimators. Determining the optimal value(s) of
the smoothing parameter(s) for $F_n$ and $F_n^{MS}$ was a bit tedious; the estimated MSE of $F_n$ was
very wiggly, and the estimated MSE of $F_n^{MS}$ is only nicely U-shaped for bigger values of $n$ due
to computational issues. Although we chose the values of $\alpha_n$ and $\beta_n$ also as the minimizing
binwidths of the estimated MSEs, these choices might not be good estimates.

[Figure 5 here]

This simulation study, of which only some results are illustrated in Figure 5 for $\hat F_n^{(1)}$ and
$\hat F_n^{(2)}$ only, shows that the estimated MSEs of both plug-in inverse estimators are almost equal.
Based on the estimated MSEs and the standard errors of the mean of the squared differences
between the estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ and the true distribution function, confidence intervals
can be computed. The intervals for $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ have non-empty intersections, implying that
for this specific example there is no significant finite sample difference between the smooth
plug-in inverse estimators.



7 Bandwidth selection in practice 

The estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$ depend on the smoothing parameters $\alpha_n$ and $\beta_n$ (the latter only for $\hat F_n^{(2)}$). As with the usual kernel density estimators, the estimators are quite sensitive to the choice of the smoothing parameters. Small values of $\alpha_n$ and $\beta_n$ result in wiggly estimators, reflecting high variance, whereas big values of $\alpha_n$ and $\beta_n$ give smooth and stable, but biased, estimators. One way to obtain good data-dependent smoothing parameters is via the smoothed bootstrap.

The focus of this paper is on the pointwise asymptotic behavior of the estimators $\hat F_n^{(1)}$ and $\hat F_n^{(2)}$, so the choice of $\alpha_n$ and $\beta_n$ is also only considered locally, at the point $(t_0,z_0)$. The smoothed bootstrap differs from the empirical bootstrap in the distribution it samples from. In the empirical bootstrap one samples from the empirical distribution function of the data, whereas in the smoothed bootstrap one samples from a, usually slightly oversmoothed, estimator of the observation density $h_{F_0}$.
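The difference between the two resampling schemes can be illustrated on a univariate sample. The sketch below is generic rather than specific to the model of this paper: it assumes a Gaussian kernel and a hand-picked bandwidth, and the function names are hypothetical.

```python
import numpy as np


def empirical_bootstrap(data, rng):
    """Resample with replacement from the observed values themselves."""
    return rng.choice(data, size=len(data), replace=True)


def smoothed_bootstrap(data, bandwidth, rng):
    """Draw from a Gaussian kernel density estimate of the data:
    resample with replacement, then add kernel noise scaled by the bandwidth."""
    resampled = rng.choice(data, size=len(data), replace=True)
    return resampled + bandwidth * rng.standard_normal(len(data))


rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=50)

emp = empirical_bootstrap(data, rng)                     # only repeats observed values
smo = smoothed_bootstrap(data, bandwidth=0.1, rng=rng)   # produces new values
```

A slightly larger bandwidth than the one used for estimation corresponds to the oversmoothing mentioned above.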

We now describe this method more specifically in our model. Let $g_{n,\alpha_0} =: \hat g_0$ and $\hat F^{(2)}_{n,\alpha_0,\beta_0} =: \hat F_0$ be the kernel estimator and the smooth plug-in inverse estimator for $g$ and $F_0$, respectively, with smoothing parameters $\alpha_0$ and $\beta_0$. Then $(X_1^{*,1},Y_1^{*,1}), (X_2^{*,1},Y_2^{*,1}),\dots,(X_n^{*,1},Y_n^{*,1})$ are sampled from $\hat F_0$, and $T_1^{*,1},T_2^{*,1},\dots,T_n^{*,1}$ from $\hat g_0$, independently of the $(X_i^{*,1},Y_i^{*,1})$. The variables $\Delta_i^{*,1}$ and $Z_i^{*,1}$ are defined as $1_{\{X_i^{*,1}\le T_i^{*,1}\}}$ and $\Delta_i^{*,1}\cdot Y_i^{*,1}$, respectively. The estimators $\hat F^{(1)}_{n,\alpha_n,1}$ and $\hat F^{(2)}_{n,\alpha_n,\beta_n,1}$ are determined at the point $(t_0,z_0)$ for several values of $\alpha_n$ and $\beta_n$ based on the sample $(T_1^{*,1},Z_1^{*,1},\Delta_1^{*,1}),\dots,(T_n^{*,1},Z_n^{*,1},\Delta_n^{*,1})$. Note that we now make the dependence of the estimators on $\alpha_n$ and $\beta_n$ explicit in the notation. Actually, to compute $\hat F^{(1)}_{n,\alpha_n,1}(t_0,z_0)$ and $\hat F^{(2)}_{n,\alpha_n,\beta_n,1}(t_0,z_0)$ we only need the precise values of those observations $(T_i^{*,1},Z_i^{*,1},\Delta_i^{*,1})$ that fall in $[t_0-\alpha_n,t_0+\alpha_n]\times[0,z_0+\beta_n]\times\{0,1\}$, the precise values of $T_i^{*,1}$ for those observations that fall in $[t_0-\alpha_n,t_0+\alpha_n]\times(z_0+\beta_n,\infty]\times\{1\}$, and the numbers of observations in the various regions outside these areas (rather than their exact locations). Hence, monotonicity of $\hat F_0$ is needed only on this strip, as well as positivity of $\hat F_0(\infty,\infty)-\hat F_0(t_0+\alpha_n,\infty)$.

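The procedure described above is repeated $B$ times, resulting in $B$ estimators $\hat F^{(1)}_{n,\alpha_n,1},\dots,\hat F^{(1)}_{n,\alpha_n,B}$ and $\hat F^{(2)}_{n,\alpha_n,\beta_n,1},\dots,\hat F^{(2)}_{n,\alpha_n,\beta_n,B}$. Then the MSEs of $\hat F^{(1)}_{n,\alpha_n}(t_0,z_0)$ and $\hat F^{(2)}_{n,\alpha_n,\beta_n}(t_0,z_0)$ can be estimated by

$$\widehat{MSE}^{(1)}(\alpha_n;t_0,z_0) = \frac1B\sum_{b=1}^{B}\left(\hat F^{(1)}_{n,\alpha_n,b}(t_0,z_0)-\hat F_0(t_0,z_0)\right)^2,$$

$$\widehat{MSE}^{(2)}(\alpha_n,\beta_n;t_0,z_0) = \frac1B\sum_{b=1}^{B}\left(\hat F^{(2)}_{n,\alpha_n,\beta_n,b}(t_0,z_0)-\hat F_0(t_0,z_0)\right)^2.$$

Then choose those values of $\alpha_n$ and $\beta_n$ that minimize $\widehat{MSE}^{(1)}(\alpha_n;t_0,z_0)$ and $\widehat{MSE}^{(2)}(\alpha_n,\beta_n;t_0,z_0)$ as smoothing parameters for the estimators $\hat F_n^{(1)}(t_0,z_0)$ and $\hat F_n^{(2)}(t_0,z_0)$, respectively.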
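The selection step for the first estimator can be sketched as below. Several parts are assumptions made only to keep the example short and runnable: the Epanechnikov kernel, the grid of candidate bandwidths, the smaller number of replications, and in particular `sample_pilot`, a hypothetical stand-in that draws $X$, $Y$ and $T$ as independent uniforms on $[0,1]$ (so that the pilot value at $(0.5,0.5)$ is $0.25$) instead of sampling from the pilot estimators $\hat F_0$ and $\hat g_0$. The estimator is written in the ratio form used in the proof of its asymptotic normality.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (an assumed kernel choice)."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)


def f1_estimate(t, z, delta, t0, z0, alpha):
    """Ratio form of the first plug-in estimator at (t0, z0):
    sum_i 1{Z_i <= z0} Delta_i k((t0 - T_i)/alpha) / sum_i k((t0 - T_i)/alpha)."""
    w = epanechnikov((t0 - t) / alpha)
    denom = w.sum()
    if denom == 0.0:
        return np.nan   # no inspection times within alpha of t0
    return float((w * delta * (z <= z0)).sum() / denom)


def sample_pilot(n, rng):
    """Hypothetical stand-in for sampling from the pilot estimators:
    X, Y, T independent uniform on [0, 1]. Returns (T, Z, Delta)."""
    x, y, t = rng.uniform(size=(3, n))
    delta = (x <= t).astype(float)
    return t, delta * y, delta


def bootstrap_mse(alphas, n, B, t0, z0, pilot_value, rng):
    """Estimated MSE of the first estimator for each candidate bandwidth."""
    est = np.empty((B, len(alphas)))
    for b in range(B):
        t, z, delta = sample_pilot(n, rng)
        for j, a in enumerate(alphas):
            est[b, j] = f1_estimate(t, z, delta, t0, z0, a)
    return np.nanmean((est - pilot_value) ** 2, axis=0)


rng = np.random.default_rng(1)
alphas = np.linspace(0.05, 0.40, 8)
mse = bootstrap_mse(alphas, n=100, B=200, t0=0.5, z0=0.5, pilot_value=0.25, rng=rng)
best_alpha = alphas[np.argmin(mse)]
print(best_alpha)
```

For the second estimator the same loop would run over a two-dimensional grid of $(\alpha_n,\beta_n)$ values.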

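Figure 6 shows the estimated MSEs for a small simulation study. In this study, we took $n = 100$, $B = 500$, $\alpha_0 = \beta_0 = 0.4$, $t_0 = z_0 = 0.5$, and $F_0$ and $g$ as in Section 6. It also shows

$$\widetilde{MSE}^{(1)}(\alpha_n;t_0,z_0) = \frac1B\sum_{b=1}^{B}\left(\hat F^{(1)}_{n,\alpha_n,b}(t_0,z_0)-F_0(t_0,z_0)\right)^2,$$

$$\widetilde{MSE}^{(2)}(\alpha_n,\beta_n;t_0,z_0) = \frac1B\sum_{b=1}^{B}\left(\hat F^{(2)}_{n,\alpha_n,\beta_n,b}(t_0,z_0)-F_0(t_0,z_0)\right)^2,$$

as a function of $\alpha_n$. For $\widehat{MSE}^{(2)}$ and $\widetilde{MSE}^{(2)}$ it only shows the estimates for that value of $\beta_n$ that has the smallest estimated MSE.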

[Figure 6 here]

There are other methods to obtain data-dependent bandwidths, for example cross-validation (Rudemo 1982). Cross-validation methods usually minimize a global risk measure (like the integrated MSE), hence the minimizer can be used as a globally optimal bandwidth.
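As an illustration of the global criterion, the following sketch computes the least-squares cross-validation score for an ordinary univariate kernel density estimator. It is not applied to the estimators of this paper. The Gaussian kernel, the bandwidth grid and the simulated data are assumptions chosen for convenience.

```python
import numpy as np


def gauss(u, s):
    """Normal density with mean 0 and standard deviation s."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


def lscv_score(x, h):
    """Least-squares cross-validation score for a Gaussian kernel density estimate:
    integral of fhat^2 minus twice the mean leave-one-out density at the data points."""
    n = len(x)
    d = x[:, None] - x[None, :]
    # The convolution of two N(0, h^2) kernels is an N(0, 2 h^2) kernel.
    int_f2 = gauss(d, np.sqrt(2.0) * h).sum() / n**2
    # Leave-one-out estimate at X_i: drop the diagonal term k_h(0).
    loo = (gauss(d, h).sum(axis=1) - gauss(0.0, h)) / (n - 1)
    return float(int_f2 - 2.0 * loo.mean())


rng = np.random.default_rng(1)
x = rng.normal(size=200)
hs = np.linspace(0.05, 1.0, 20)
scores = np.array([lscv_score(x, h) for h in hs])
h_cv = hs[np.argmin(scores)]
print(h_cv)
```

The minimizer `h_cv` is a single bandwidth for the whole curve, in contrast to the local choice at $(t_0,z_0)$ made by the bootstrap procedure.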



8 Concluding remarks 

In this paper we consider two plug-in inverse estimators for the distribution function of the vector $(X,Y)$ in the current status continuous mark model. The first estimator, $\hat F_n^{(1)}$, is shown to be consistent and pointwise asymptotically normally distributed. However, $\hat F_n^{(1)}$ does not have a Lebesgue density, since it only puts mass on the lines $[0,\infty)\times\{Z_i\}$ with $Z_i > 0$ for $i = 1,\dots,n$. The second estimator, $\hat F_n^{(2)}$, does have a Lebesgue density. For a range of possible choices of the bandwidths $\alpha_n$ and $\beta_n$ we establish consistency of this estimator. Taking $\alpha_n = n^{-1/5}$ and $\beta_n = n^{-\beta}$, we prove that asymptotically, for $\beta < 3/10$, the Lebesgue density of $\hat F_n^{(2)}$ is positive on a region where $f_0$ is positive and which stays away from the boundary of its support. This means that, although for finite sample size $n$ the estimator $\hat F_n^{(2)}$ need not be a bivariate distribution function, "isotonisation" of it is not necessary asymptotically. Put differently, any common shape-regularized version of our estimator is asymptotically equivalent to our estimator. However, this only holds asymptotically, and for finite sample size $n$ it might be desirable to have an estimator which is a true bivariate distribution function, satisfying condition (6), for example when one wants to sample from it in a smoothed bootstrap procedure. Furthermore, we prove that $\hat F_n^{(2)}$ is asymptotically normally distributed for $\beta \ge 1/5$. Hence, for $\beta \in [1/5,3/10)$, the estimator $\hat F_n^{(2)}$ asymptotically behaves as a distribution function with pointwise normal limiting distribution on a large subset of $[0,\infty)^2$.
A Technical lemmas and proofs

Lemma 6 Assume that $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Let $S$ and $\mathcal M_\delta$ be as defined in Theorem [?]. Then $\mathrm{proj}_x\mathcal M_\delta = \{t : (t,z)\in\mathcal M_\delta \text{ for some } z\}$ is a closed subset of $S_{0,x}$.

Proof: Fix $\delta > 0$. To prove that $\mathrm{proj}_x\mathcal M_\delta$ is a closed subset of $S_{0,x}$ we prove that
(i) $\mathcal M_\delta$ is closed in $[0,\infty)^2$,
(ii) $\mathrm{proj}_x\mathcal M_\delta$ is closed in $[0,\infty)$,
(iii) $\mathrm{proj}_x\mathcal M_\delta$ is a subset of $S_{0,x}$.

We start with proving (i). By definition of $S$, there exists an open set $U \supset S$ on which $f_0$ is uniformly continuous. Define

$$U_\delta = \{(t,z)\in U : f_0(t,z)\ge 2\delta\} = f_0^{-1}\big([2\delta,\infty)\big).$$

The function $f_0$ is continuous on $U$, hence $U_\delta$ is closed. Since we also have that

$$\mathcal M_\delta = \{(t,z)\in[0,\infty)^2 : f_0(t,z)\ge 2\delta\}\cap S = U_\delta\cap S,$$

$\mathcal M_\delta$ is the intersection of two closed sets, hence closed itself.

For proving (ii), assume $\mathrm{proj}_x\mathcal M_\delta$ is not closed. Then there exists a sequence $(x_n)_n \in \mathrm{proj}_x\mathcal M_\delta$ with $x_n \to x \notin \mathrm{proj}_x\mathcal M_\delta$. By definition of $\mathrm{proj}_x\mathcal M_\delta$ there exists a sequence $(x_n,y_n)_n \in \mathcal M_\delta$. By compactness of $\mathcal M_\delta$ (this follows from (i)), there exist a subsequence $(n_k)_k$ and $(x,y)\in\mathcal M_\delta$ such that $(x_{n_k},y_{n_k})_k \to (x,y)$. From this it follows that $x\in\mathrm{proj}_x\mathcal M_\delta$. This yields a contradiction, hence $\mathrm{proj}_x\mathcal M_\delta$ is closed.

To prove (iii), first note that by uniform continuity of $f_0$ on $\mathcal M_\delta$

$$\exists\,\eta>0 \text{ such that } \forall\,(t,z),(s,y)\in\mathcal M_\delta : \|(t,z)-(s,y)\|<\eta \Rightarrow |f_0(t,z)-f_0(s,y)|<\delta.$$

Now take $t\in\mathrm{proj}_x\mathcal M_\delta$. Then for all $s$ in a small neighborhood of $t$ and $z_s > 0$ such that $(s,z_s)\in\mathcal M_\delta$ [...], hence $t\in S_{0,x}$. □
Lemma 7 Assume that $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Let $S$ and $\mathcal M_\delta$ be as defined in Theorem [?]. Then

$$\exists\, l_g > 0,\ u_g < \infty : \quad l_g \le g(t) \le u_g \quad\text{for all } t\in\mathrm{proj}_x\mathcal M_\delta. \qquad (19)$$

Proof: The set $\mathrm{proj}_x\mathcal M_\delta$ is a closed subset of $S_{0,x}$ by Lemma 6. On $S_{0,x}$ we have that $0 < g < \infty$, hence also on $\mathrm{proj}_x\mathcal M_\delta$. Now assume the lower bound in (19) does not hold. Then there exists a sequence $(t_n)_n \to t\in\mathrm{proj}_x\mathcal M_\delta$ such that $g(t_n)\to 0$. By the uniform continuity of $g$, this implies that $g(t) = 0$, yielding a contradiction.

The existence of $u_g$ follows immediately from (G.1). □



Lemma 8 Assume $F_0$ and $g$ satisfy conditions (F.1) and (G.1). Also assume $k$ and $\tilde k$ satisfy conditions (K.2) and (K.3). In addition, assume $k'$ and $\partial_1\tilde k$ are uniformly continuous. Furthermore, let $\alpha_n$ and $\beta_n$ satisfy condition (G.2). Let $S\subset[0,\infty)^2$ be compact and such that $f_0$ is uniformly continuous on an open subset that contains $S$, and define $\|f\|_{S,\infty} = \sup_{(x,y)\in S}|f(x,y)|$ and $S_x = \mathrm{proj}_x S$. Then

$$\|g_n - g\|_{S_x,\infty}\stackrel{P}{\longrightarrow}0,\qquad \|g_n' - g'\|_{S_x,\infty}\stackrel{P}{\longrightarrow}0, \qquad (20)$$

$$\|h_{n,1} - h_1\|_{S,\infty}\stackrel{P}{\longrightarrow}0,\qquad \|\partial_1 h_{n,1} - \partial_1 h_1\|_{S,\infty}\stackrel{P}{\longrightarrow}0. \qquad (21)$$

Proof: The results in (20) follow directly from Theorems A and C in Silverman (1978). The first result in (21) follows from Theorem 3.3 in Cacoullos (1964). By Theorem 3 in Mokkadem et al. (2005),



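$$\lim_{n\to\infty}\,(n\alpha_n^3\beta_n)^{-1}\log P\big(\|\partial_1 h_n - \partial_1 h\|_{S,\infty} > \delta\big) = -c,$$

for some constant $c > 0$ only depending on $\delta$ and $\partial_1 h$. Hence, for $n$ sufficiently large,

$$P\big(\|\partial_1 h_n - \partial_1 h\|_{S,\infty} > \delta\big) \le 2e^{-c\,n\alpha_n^3\beta_n/2} \to 0.$$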



The results in Cacoullos (1964) and Mokkadem et al. (2005) hold for density estimators, whereas the estimator $h_{n,1}$ is a sub-density. However, the results in (21) follow from these after defining a binomially distributed sample size $N_1$ and reasoning similarly as in the proof of (A.3) in Groeneboom et al. (2010). □



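Proof of Theorem 3: Define

$$Y_i = \begin{pmatrix} Y_{i;1}\\ Y_{i;2}\end{pmatrix} = n^{-3/5}\begin{pmatrix} k_{\alpha_n}(t_0 - T_i)\\ 1_{[0,z_0]}(Z_i)\,\Delta_i\,k_{\alpha_n}(t_0 - T_i)\end{pmatrix}.$$

By the assumptions on $F_0$ and $g$ and condition (K.2), we have

$$E\,Y_{i;2} = n^{-3/5}F_0(t_0,z_0)g(t_0) + \tfrac12 n^{-1}c^2 m_2(k)\,\partial_1^2\{F_0(t_0,z_0)g(t_0)\} + o(n^{-1}),$$

$$\mathrm{Var}\,Y_{i;2} = n^{-1}c^{-1}F_0(t_0,z_0)g(t_0)\int k^2(y)\,dy + O(n^{-6/5}).$$

The corresponding expressions for $Y_{i;1}$ follow by replacing $F_0(t_0,z_0)$ by 1. Furthermore we have

$$\mathrm{Cov}(Y_{i;1},Y_{i;2}) = n^{-1}c^{-1}F_0(t_0,z_0)g(t_0)\int k^2(y)\,dy + O(n^{-6/5}),$$

so that

$$n\,\mathrm{Var}\,Y_1 = c^{-1}g(t_0)\int k(u)^2\,du \begin{pmatrix} 1 & F_0(t_0,z_0)\\ F_0(t_0,z_0) & F_0(t_0,z_0)\end{pmatrix} + O(n^{-1/5}) = \Sigma_1 + O(n^{-1/5}).$$

Here we denote by $\mathrm{Var}\,Y_1$ the covariance matrix of the vector $Y_1$. By the Lindeberg-Feller central limit theorem we then get

$$n^{2/5}\left\{\begin{pmatrix}\frac1n\sum_{i=1}^n k_{\alpha_n}(t_0-T_i)\\[2pt] \frac1n\sum_{i=1}^n 1_{[0,z_0]}(Z_i)\Delta_i k_{\alpha_n}(t_0-T_i)\end{pmatrix} - \begin{pmatrix} g(t_0)\\ F_0(t_0,z_0)g(t_0)\end{pmatrix}\right\} - \tfrac12 c^2 m_2(k)\begin{pmatrix} g''(t_0)\\ \partial_1^2\{F_0(t_0,z_0)g(t_0)\}\end{pmatrix} = \sum_{i=1}^n (Y_i - E\,Y_i) + o(1) \stackrel{\mathcal D}{\longrightarrow} \mathcal N(0,\Sigma_1). \qquad (22)$$

For the pointwise asymptotic result for $\hat F_n^{(1)}$, note that

$$\hat F_n^{(1)}(t_0,z_0) = \phi\left(\frac1n\sum_{i=1}^n k_{\alpha_n}(t_0-T_i),\ \frac1n\sum_{i=1}^n 1_{[0,z_0]}(Z_i)\Delta_i k_{\alpha_n}(t_0-T_i)\right)$$

for $\phi(u,v) = v/u$. Now applying the Delta-method to (22) gives

$$n^{2/5}\big(\hat F_n^{(1)}(t_0,z_0) - F_0(t_0,z_0)\big) \stackrel{\mathcal D}{\longrightarrow} \mathcal N(\mu_1,\sigma^2),$$

where $\mu_1$ and $\sigma^2$ are defined in (10) and (11). □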
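The Delta-method step can be written out as a short worked computation. It is a sketch that takes as given the limit covariance $\Sigma_1 = c^{-1}g(t_0)\int k(u)^2\,du\,\begin{pmatrix}1 & F_0\\ F_0 & F_0\end{pmatrix}$ and the bias vector $\tfrac12c^2m_2(k)\,(g''(t_0),\ \partial_1^2\{F_0(t_0,z_0)g(t_0)\})^T$ from (22); the resulting expressions should agree with (10) and (11), which are not restated here. Arguments $(t_0,z_0)$ and $t_0$ are suppressed, and the vector $a$ is introduced only for this illustration.

```latex
% Gradient of \phi(u,v) = v/u at (u,v) = (g, F_0 g):
\nabla\phi(u,v) = \Big(-\frac{v}{u^{2}},\ \frac{1}{u}\Big)^{T}
\quad\Longrightarrow\quad
a := \nabla\phi(g, F_0 g) = \Big(-\frac{F_0}{g},\ \frac{1}{g}\Big)^{T}.

% Asymptotic variance:
a^{T}\Sigma_1 a
  = c^{-1} g \int k(u)^{2}\,du\,
    \Big(\frac{F_0^{2}}{g^{2}} - \frac{2F_0^{2}}{g^{2}} + \frac{F_0}{g^{2}}\Big)
  = \frac{F_0\,(1-F_0)}{c\,g}\int k(u)^{2}\,du.

% Asymptotic bias, using
% \partial_1^{2}\{F_0 g\} = g\,\partial_1^{2}F_0 + 2 g'\,\partial_1 F_0 + F_0\, g'':
a^{T}\,\tfrac12 c^{2} m_2(k)
  \begin{pmatrix} g'' \\ \partial_1^{2}\{F_0 g\} \end{pmatrix}
  = \tfrac12 c^{2} m_2(k)
    \Big(\partial_1^{2}F_0 + \frac{2\,g'\,\partial_1 F_0}{g}\Big).
```

The variance has the familiar binomial form $F_0(1-F_0)$, scaled by the local amount of information $c\,g(t_0)$ around $t_0$.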
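Proof of Theorem 4: For $i = 1,2$, let $N_n^{(i)}(t_0,z_0)$ be the numerator in the definition of $\hat F_n^{(i)}(t_0,z_0)$ at a fixed point $(t_0,z_0)$, and note that we can write

$$\begin{aligned}
N_n^{(2)}(t_0,z_0) &= \frac1n\sum_{i=1}^n 1_{[0,z_0-\beta_n]}(Z_i)\Delta_i\int_0^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad+ \frac1n\sum_{i=1}^n 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_0^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad+ \frac1n\sum_{i=1}^n 1_{(z_0,z_0+\beta_n]}(Z_i)\Delta_i\int_0^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&= \frac1n\sum_{i=1}^n 1_{[0,z_0-\beta_n]}(Z_i)\Delta_i\int_0^{\infty}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad+ \frac1n\sum_{i=1}^n 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_0^{\infty}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad- \frac1n\sum_{i=1}^n 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_{z_0}^{\infty}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad+ \frac1n\sum_{i=1}^n 1_{(z_0,z_0+\beta_n]}(Z_i)\Delta_i\int_0^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&= \frac1n\sum_{i=1}^n 1_{[0,z_0]}(Z_i)\Delta_i k_{\alpha_n}(t_0-T_i) - \frac1n\sum_{i=1}^n 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_{z_0}^{Z_i+\beta_n}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad+ \frac1n\sum_{i=1}^n 1_{(z_0,z_0+\beta_n]}(Z_i)\Delta_i\int_{Z_i-\beta_n}^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz.
\end{aligned}$$

In the last equality we use (K.1), so that

$$\begin{aligned}
R_n = N_n^{(2)}(t_0,z_0) - N_n^{(1)}(t_0,z_0) &= \frac1n\sum_{i=1}^n 1_{(z_0,z_0+\beta_n]}(Z_i)\Delta_i\int_{Z_i-\beta_n}^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\quad- \frac1n\sum_{i=1}^n 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_{z_0}^{Z_i+\beta_n}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz = \frac1n\sum_{i=1}^n U_i.
\end{aligned}$$

First we consider the variance of $n^{2/5}R_n$. Observe that

$$\begin{aligned}
|U_i| &\le 1_{(z_0,z_0+\beta_n]}(Z_i)\Delta_i\int_{Z_i-\beta_n}^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz + 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_{z_0}^{Z_i+\beta_n}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&\le 1_{(z_0,z_0+\beta_n]}(Z_i)\Delta_i\int_{Z_i-\beta_n}^{Z_i+\beta_n}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz + 1_{(z_0-\beta_n,z_0]}(Z_i)\Delta_i\int_{Z_i-\beta_n}^{Z_i+\beta_n}\tilde k_{\alpha_n,\beta_n}(t_0-T_i,z-Z_i)\,dz\\
&= 1_{(z_0-\beta_n,z_0+\beta_n]}(Z_i)\Delta_i k_{\alpha_n}(t_0-T_i) =: S_i,
\end{aligned}$$

with

$$E\,S_i^2 = \int\!\!\int 1_{(z_0-\beta_n,z_0+\beta_n]}(v)\,k_{\alpha_n}^2(t_0-u)\,h_1(u,v)\,du\,dv = \alpha_n^{-1}\beta_n\,2h_1(t_0,z_0)\int k^2(x)\,dx + O(\beta_n).$$

Since

$$\mathrm{Var}\,R_n = \frac1n\mathrm{Var}\,U_1 = \frac1n\{E\,U_1^2 - (E\,U_1)^2\} \le \frac1n E\,S_1^2 = O(n^{-1}\alpha_n^{-1}\beta_n),$$

$\mathrm{Var}\,n^{2/5}R_n \to 0$ for $\alpha_n = n^{-1/5}$ and $\beta_n = n^{-\beta}$ with $\beta > 0$.

Now we consider the expectation of $U_i$.

$$\begin{aligned}
E\,U_i &= \int_v\!\int_u 1_{(z_0,z_0+\beta_n]}(v)\int_{v-\beta_n}^{z_0}\tilde k_{\alpha_n,\beta_n}(t_0-u,z-v)\,dz\,h_1(u,v)\,du\,dv\\
&\quad- \int_v\!\int_u 1_{(z_0-\beta_n,z_0]}(v)\int_{z_0}^{v+\beta_n}\tilde k_{\alpha_n,\beta_n}(t_0-u,z-v)\,dz\,h_1(u,v)\,du\,dv\\
&= \beta_n\int_{y=-1}^{0}\int_{x=-1}^{1}\int_{w=-1}^{y}\tilde k(x,w)\,dw\,h_1(t_0-\alpha_n x,z_0-\beta_n y)\,dx\,dy\\
&\quad- \beta_n\int_{y=0}^{1}\int_{x=-1}^{1}\int_{w=y}^{1}\tilde k(x,w)\,dw\,h_1(t_0-\alpha_n x,z_0-\beta_n y)\,dx\,dy\\
&= \beta_n h_1(t_0,z_0)\left\{\int_{x=-1}^{1}\int_{y=-1}^{0}\int_{w=-1}^{y}\tilde k(x,w)\,dw\,dy\,dx - \int_{x=-1}^{1}\int_{y=0}^{1}\int_{w=y}^{1}\tilde k(x,w)\,dw\,dy\,dx\right\}\\
&\quad- \beta_n\alpha_n\,\partial_1h_1(t_0,z_0)\left\{\int_{x=-1}^{1}\int_{y=-1}^{0}\int_{w=-1}^{y}x\tilde k(x,w)\,dw\,dy\,dx - \int_{x=-1}^{1}\int_{y=0}^{1}\int_{w=y}^{1}x\tilde k(x,w)\,dw\,dy\,dx\right\}\\
&\quad- \beta_n^2\,\partial_2h_1(t_0,z_0)\left\{\int_{x=-1}^{1}\int_{y=-1}^{0}\int_{w=-1}^{y}y\tilde k(x,w)\,dw\,dy\,dx - \int_{x=-1}^{1}\int_{y=0}^{1}\int_{w=y}^{1}y\tilde k(x,w)\,dw\,dy\,dx\right\}\\
&\quad+ O(\beta_n\alpha_n^2) + O(\beta_n^2\alpha_n) + O(\beta_n^3)\\
&= -\beta_n h_1(t_0,z_0)\int_{x=-1}^{1}\int_{w=-1}^{1}w\tilde k(x,w)\,dw\,dx + \beta_n\alpha_n\,\partial_1h_1(t_0,z_0)\int_{x=-1}^{1}\int_{w=-1}^{1}xw\tilde k(x,w)\,dw\,dx\\
&\quad+ \tfrac12\beta_n^2\,\partial_2h_1(t_0,z_0)\int_{x=-1}^{1}\int_{w=-1}^{1}w^2\tilde k(x,w)\,dw\,dx + O(\beta_n\alpha_n^2) + O(\beta_n^2\alpha_n) + O(\beta_n^3),
\end{aligned}$$

where the last equality follows from changing the order of integration. By condition (K.3), the first two integrals are zero and the last integral equals $m_2(k)$, so that

$$E\,n^{2/5}R_n \to \begin{cases} \pm\infty & \text{for } \beta < 1/5,\\ \tfrac12c^2m_2(k)\,g(t_0)\,\partial_2^2F_0(t_0,z_0) & \text{for } \beta = 1/5,\\ 0 & \text{for } \beta > 1/5.\end{cases}$$

Applying Slutsky's Lemma to

$$n^{2/5}\big(\hat F_n^{(2)}(t_0,z_0) - \hat F_n^{(1)}(t_0,z_0)\big) = \frac{n^{2/5}R_n}{g_n(t_0)}$$

gives the result. □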
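The change in the order of integration used for the expectation can be checked numerically. The sketch below assumes a product kernel $\tilde k(x,w) = k(x)k(w)$ with the Epanechnikov kernel $k$, as in the figure captions, so that the $x$-integral contributes a factor 1 and the term weighted by $x$ vanishes by symmetry. It then verifies on a midpoint grid that the signed double integrals over the two triangular regions equal $-\int wk(w)\,dw = 0$ and $-\tfrac12\int w^2k(w)\,dw = -\tfrac12m_2(k)$. The grid size and variable names are arbitrary choices.

```python
import numpy as np


def k(u):
    """Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)


N = 1000
h = 2.0 / N
grid = -1.0 + h * (np.arange(N) + 0.5)           # midpoints of [-1, 1]
y, w = np.meshgrid(grid, grid, indexing="ij")

# Cells on the diagonal w == y are split evenly between the two sides.
below = (w < y) + 0.5 * (w == y)                 # w in [-1, y]
above = (w > y) + 0.5 * (w == y)                 # w in [y, 1]
# +1 on {y < 0, w < y}, -1 on {y > 0, w > y}: the two regions in E U_i.
region = np.where(y < 0, below, -above)


def signed_integral(weight):
    """Integral of weight(y) * k(w) over the first region minus the second."""
    return float((weight * k(w) * region).sum() * h * h)


m1 = float((grid * k(grid)).sum() * h)           # first moment of k (zero)
m2 = float((grid**2 * k(grid)).sum() * h)        # second moment m_2(k) = 1/5

const_term = signed_integral(1.0)                # should equal -m1
y_term = signed_integral(y)                      # should equal -m2 / 2
print(const_term, y_term, m2)
```

With these values the leading term of $E\,U_i$ is $\tfrac12\beta_n^2m_2(k)\,\partial_2h_1(t_0,z_0)$, matching the last display of the proof.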
Figure 1: Two examples of the estimator $\hat F_n^{(1)}$ for a sample of size $n = 10$ (left panel) and of size $n = 100$ (right panel), $k(x) = \tfrac12 1_{[-1,1]}(x)$, $\alpha_n = 0.65$ (left panel) and $\alpha_n = 0.40$ (right panel), $F_0(x,y) = xy$ on $[0,1]^2$ and $g(t) = 1_{[0,1]}(t)$.



Figure 2: QQ-plot (left panel) and histogram (right panel) of $m = 1\,000$ values $n^{2/5}\big(\hat F_n^{(1)}(0.5,0.5) - F_0(0.5,0.5)\big)$ for $n = 5\,000$, $k(y) = \tfrac34(1-y^2)1_{[-1,1]}(y)$, $\alpha_n = 0.09$, $F_0(x,y) = \tfrac12xy(x+y)$ and $g(t) = 2t$, illustrating Theorem 3.
Figure 3: Values of $n^{2/5}\big(\hat F_n^{(2)}(0.5,0.5) - \hat F_n^{(1)}(0.5,0.5)\big)$ as a function of $n$ for $\tilde k(x,y) = k(x)k(y)$, $k(u) = \tfrac34(1-u^2)$, $\alpha_n = \beta_n = \tfrac12n^{-1/5}$, $F_0(x,y) = \tfrac12xy(x+y)$ and $g(t) = 2t$.




Figure 4: QQ-plot (left panel) and histogram (right panel) of $m = 1\,000$ values $n^{2/5}\big(\hat F_n^{(2)}(0.5,0.5) - F_0(0.5,0.5)\big)$ for $n = 5\,000$, $\tilde k(x,y) = k(x)k(y)$, $k(u) = \tfrac34(1-u^2)$, $\alpha_n = 0.091$, $\beta_n = 0.029$, $F_0(x,y) = \tfrac12xy(x+y)$ and $g(t) = 2t$, illustrating Corollary 5.
Estimators $\hat F_n^{(1)}$ (smoothing parameter $\alpha_n$) and $\hat F_n^{(2)}$ (smoothing parameters $(\alpha_n,\beta_n)$):

(t_0,z_0)    n         alpha_n    MSE (s.e.)                  (alpha_n,beta_n)    MSE (s.e.)
(0.4,0.4)    500       0.20       5.14·10^-4 (5.10·10^-?)     (0.20,0.25)         4.43·10^-4 (4.33·10^-?)
             1 000     0.20       3.31·10^-? (3.04·10^-?)     (0.20,0.15)         3.10·10^-4 (3.06·10^-?)
             5 000     0.15       8.09·10^-? (8.47·10^-?)     (0.15,0.10)         7.74·10^-? (8.28·10^-6)
             10 000    0.15       4.50·10^-5 (3.43·10^-?)     (0.15,0.05)         4.50·10^-? (3.38·10^-6)
(0.6,0.6)    500       0.25       8.21·10^-4 (7.48·10^-?)     (0.25,0.15)         7.82·10^-4 (7.04·10^-?)
             1 000     0.20       5.31·10^-4 (4.34·10^-?)     (0.20,0.05)         5.31·10^-4 (4.33·10^-?)
             5 000     0.15       1.21·10^-4 (9.98·10^-?)     (0.15,0.05)         1.21·10^-4 (9.88·10^-?)
             10 000    0.15       9.21·10^-5 (7.41·10^-6)     (0.15,0.05)         9.14·10^-5 (7.31·10^-6)

Estimators $\tilde F_n$ (smoothing parameter $\alpha_n$) and $\hat F_n^{MS}$ (smoothing parameters $(\alpha_n,\beta_n)$):

(t_0,z_0)    n         alpha_n    MSE (s.e.)                  (alpha_n,beta_n)    MSE (s.e.)
(0.4,0.4)    500       0.200      5.56·10^-4 (4.56·10^-?)     (0.250,0.250)       7.21·10^-4 (5.58·10^-?)
             1 000     0.100      3.26·10^-4 (2.83·10^-?)     (0.200,0.500)       3.48·10^-4 (3.30·10^-?)
             5 000     0.100      1.10·10^-4 (9.98·10^-6)     (0.200,0.333)       7.20·10^-5 (7.11·10^-?)
             10 000    0.067      6.38·10^-5 (4.82·10^-6)     (0.167,0.333)       7.35·10^-5 (6.45·10^-6)
(0.6,0.6)    500       0.200      1.45·10^-3 (1.35·10^-4)     (0.250,0.250)       5.51·10^-4 (5.28·10^-?)
             1 000     0.250      3.59·10^-3 (1.97·10^-4)     (0.250,0.200)       4.13·10^-4 (3.40·10^-?)
             5 000     0.333      1.54·10^-2 (2.03·10^-4)     (0.250,0.167)       2.24·10^-4 (5.66·10^-?)
             10 000    0.333      1.50·10^-2 (1.48·10^-4)     (0.250,0.200)       1.32·10^-4 (7.23·10^-6)

Table 1: Minimum values of the estimated MSE of the estimators $\hat F_n^{(1)}$, $\hat F_n^{(2)}$, $\tilde F_n$ and $\hat F_n^{MS}$ for different values of $n$ at different points $(t_0,z_0)$ for the simulation study. The values of $\alpha_n$ and $\beta_n$ that resulted in these minimal values are also given, as well as the standard errors of the mean of the squared differences between the estimator and the true value.
[Figure 5 panels, each with $\alpha_n$ from 0.05 to 0.40 on the horizontal axis: (a) $n = 500$ with $\beta_n = 0.15$ for $\hat F_n^{(2)}$; (b) $n = 1\,000$ with $\beta_n = 0.05$ for $\hat F_n^{(2)}$; (c) $n = 5\,000$ with $\beta_n = 0.05$ for $\hat F_n^{(2)}$; (d) $n = 10\,000$ with $\beta_n = 0.05$ for $\hat F_n^{(2)}$]

Figure 5: The estimated MSE as a function of the smoothing parameter $\alpha_n$ for the estimators $\hat F_n^{(1)}$ (dotted line) and $\hat F_n^{(2)}$ (dashed line) for different values of $n$ at the point $(0.6,0.6)$ for the simulation study.
Figure 6: Values of the estimated MSEs $\widehat{MSE}^{(1)}(\alpha_n;0.5,0.5)$ (dotted line), $\widehat{MSE}^{(2)}(\alpha_n,0.78;0.5,0.5)$ (solid line), $\widetilde{MSE}^{(1)}(\alpha_n;0.5,0.5)$ (dash-dotted line) and $\widetilde{MSE}^{(2)}(\alpha_n,0.8;0.5,0.5)$ (dashed line) as a function of $\alpha_n$.